The Aberrant Methylation Sites Identification and Function Analysis Associated With DNMT3A And IDH Mutations in AML  

Ci C., Wang Y.H., Gu Y., Su J.Z.
College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
Cancer Genetics and Epigenetics, 2015, Vol. 3, No. 11   doi: 10.5376/cge.2015.03.00011
Received: 17 Aug., 2015    Accepted: 18 Sep., 2015    Published: 27 Oct., 2015
© 2015 BioPublisher Publishing Platform
This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article:

Ci C., Wang Y.H., Gu Y., and Su J.Z., 2015, The Aberrant Methylation Sites Identification and Function Analysis Associated With DNMT3A And IDH Mutations in AML, Cancer Genetics and Epigenetics, Vol.3, No.11, 1-8 (doi: 10.5376/cge.2015.03.00011)

Abstract

DNA methylation is a major epigenetic modification and plays an important role in the development of both normal and diseased cells. Mutations in DNMT3A and IDH are found in many patients with acute myeloid leukemia (AML). They lead to dysfunction of the DNMT3A protein and of isocitrate dehydrogenase and are associated with poor prognosis. However, how these mutations regulate DNA methylation in AML, and thereby affect the development of the disease, is not clear. We carried out the following work. In a survival analysis, we studied the influence of different mutations on patient survival and selected DNMT3A and IDH as the key genes. We used JHU-USC HumanMethylation450K data for 74 AML samples downloaded from The Cancer Genome Atlas (https://tcga-data.nci.nih.gov), together with 40 normal samples downloaded from the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/). Using QDMR (http://bioinfo.hrbmu.edu.cn/qdmr/) and SAM (Significance Analysis of Microarrays, run through the SAMR package), we screened 1,991 differentially methylated sites (DMS), which map to 1,452 genes. Cluster analysis showed little inter-individual difference among the normal samples, and disease samples had a higher proportion of methylation than normal samples. We then mapped the DMS to the genome and found that hypermethylation in the promoter tends to be associated with lower expression, whereas DNA methylation and gene expression show a weak positive correlation in the gene body. Functional enrichment analysis showed that the differentially methylated genes are mostly enriched in cancer pathways and cell adhesion. This study uses DNA methylation to classify samples and to analyze the function of DMS, and it may help the diagnosis and therapy of AML.

Keywords
Acute myeloid leukemia; DNA methylation; DNMT3A; IDH

Introduction
Acute myeloid leukemia (AML) comprises all acute non-lymphocytic leukemias and is a highly heterogeneous disease [1, 2]. With improved treatments, the survival rate and quality of life of patients with AML have improved. Choosing an appropriate treatment plan based on disease risk grade and prognosis analysis is very important. AML is particularly common in adults, and its incidence is higher in the elderly than in the young.

DNA methylation plays an important role both in inducing disease and in maintaining the normal cellular life cycle [3-5]. A significant proportion of patients with AML carry DNMT3A and IDH mutations, which contribute to dysfunction of the DNMT3A protein and of isocitrate dehydrogenase and to poor prognosis [6, 7]. First, we examined the effect of common mutations on patient survival and selected DNMT3A and IDH as the key genes. Next, we combined SAM and QDMR (http://bioinfo.hrbmu.edu.cn/qdmr/) to screen DMS and classify samples. Then, we analyzed the relationship between DNA methylation and gene expression in different genomic regions. Finally, we performed functional enrichment analysis of the differentially methylated genes. This study uses DNA methylation to classify samples and to analyze the function of DMS, and it may help the diagnosis and therapy of AML.

The aim of the study is to identify DMS between disease samples and normal samples and to explore the function of these DMS. Epigenetic modifications do not change the gene sequence. They fall into several classes, such as DNA methylation and histone modification [8]. The causes of AML include epigenetic changes in hematopoietic stem cells, which alter the proliferation and differentiation of these cells [6].

DNA methylation is a major epigenetic modification and plays an important role in the development of both normal and diseased cells [3, 9, 10]. Recurrent abnormal DNA methylation patterns are commonly observed in cancer cells, which indicates that epigenetic modification is associated with the formation and development of cancer [4, 5, 11]. During cell division, DNA methylation is involved in the synthesis of new DNA strands. DNMT3A and DNMT3B help to transfer methyl groups in this process [6]. DNA methylation is also involved in normal hematopoietic development and has an important impact on myeloid tumors. Analysis of methylation profiles at specific genomic locations may inform the treatment of individual tumor subtypes [12-14]. The CpG island methylator phenotype (CIMP) is used clinically to distinguish cancer phenotypes by CpG island hypermethylation [15, 16]. Recently, some genomic alterations have also been found to affect epigenetic regulation, such as those in the IDH1 and IDH2 genes in malignant glioma and the TET2 gene in leukemia [17].

Mutations in DNMT3A are found in many patients with AML. They lead to dysfunction of the DNMT3A protein and poor prognosis [18-22]. DNMT3A, located on chromosome 2p23, is highly expressed in some tumors [6].

DNMT3A mutations are associated with poor prognosis in AML, which suggests that they should be taken into account in risk stratification [6]. DNMT inhibitors have been used in the treatment of myelodysplastic syndrome and leukemia [6]. However, biomarkers that predict the response to DNMT inhibitors, or that guide the management of other treatment modalities, are still unknown [6].

IDH1 and IDH2 mutations occur frequently in AML patients [7]. Isocitrate dehydrogenase is involved in cell metabolism: it interconverts isocitrate and α-ketoglutarate in the cell, a reaction linked to the regulation of DNA and histone modification [7]. However, when the gene encoding the enzyme is mutated, IDH converts α-ketoglutarate to 2-hydroxyglutarate [23]. Studies have shown that 2-hydroxyglutarate acts as an antagonist that lowers the activity of α-ketoglutarate-dependent enzymes, resulting in chromatin methylation and cancer [7, 24]. In addition, the citric acid cycle is essential for many biochemical pathways, and isocitrate dehydrogenase is one of its important enzymes [23].

1 Methods
1.1 Datasets
The AML clinical data were downloaded from TCGA (https://tcga-data.nci.nih.gov). The Cancer Genome Atlas (TCGA) is coordinated by a project team comprising individuals from both the National Cancer Institute and the National Human Genome Research Institute and is advised by an External Scientific Committee whose members include patient advocates, senior scientists and clinicians with relevant expertise in cancer, genomics and ethics. The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information (NCBI) archives and freely disseminates microarray and other forms of high-throughput data generated by the scientific community.

JHU-USC HumanMethylation450K data for 74 AML samples were downloaded from TCGA (42 DNMT3A samples, 19 IDH samples and 13 IDH_3AD samples), together with DNA methylation data for 40 normal samples downloaded from GEO (accession number GSE35069). Gene expression data for the disease samples were downloaded from TCGA (https://tcga-data.nci.nih.gov; platform WUSM HG-U133_Plus_2, RMA-normalized). Gene expression data for the control samples were downloaded from GEO (accession number GSE48060, 21 samples). The reference genome was obtained from UCSC, whose site contains the reference sequence and working draft assemblies for a large collection of genomes and also provides portals to ENCODE data at UCSC (2003 to 2012) and to the Neandertal project. Samples with DNMT3A mutations were labeled DNMT3A, samples with IDH mutations were labeled IDH, samples with both DNMT3A and IDH mutations were labeled IDH_3AD, and normal samples were labeled control.

1.2 DNA methylation data analysis
First, the samples were divided into four groups: DNMT3A, IDH, IDH_3AD and control. After removing missing values, we screened for DMS within each group, in order to remove the influence of differences among samples of the same type and make the results more accurate. We used the QDMR method, which is based on information entropy, to screen DMS. To model the effect of experimental variability, we simulated the distribution of entropy from uniformly methylated regions. We computed the fold change between the replicate-dependent difference from the average level across replicates and the theoretical maximum range of methylation. The fold change follows a normal distribution with mean equal to zero and some unknown, but 'small', standard deviation (SD) [25]. Because this is a data preprocessing step, we chose a relatively relaxed threshold (SD = 0.15). Second, we removed the within-group DMS and intersected the remaining sites across groups. Third, we used QDMR to screen DMS between mutation samples and normal samples (SD = 0.07), in order to find DMS between disease and normal samples. We then took the union of all resulting CpG sites and processed the DNA methylation profile as follows (X is the original methylation profile with m rows, Y is the mean matrix, n1 is the number of DNMT3A mutation samples, n2 is the number of IDH mutation samples, n3 is the number of IDH_3AD samples, n4 is the number of normal samples, and m is the number of DMS after taking the union):

(1)

(2)

(3)

(4)

(5)


We screened the mean matrix Y for DMS using the SAMR package. SAM is a statistical tool for finding significant genes in a set of microarray data. DNA methylation sites that satisfied both SAM selection criteria were taken as DMS. The DMS were then mapped to genes to obtain the gene-level DNA methylation profile.
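
The entropy idea behind the screening described above can be illustrated with a minimal sketch. It computes plain Shannon entropy of one CpG site's methylation levels across samples; this is a simplification for illustration only and not the QDMR implementation, which applies further adjustments [25]. The function name, the handling of all-zero sites and the toy values are assumptions.

```python
import math

def methylation_entropy(values):
    """Shannon entropy (in bits) of one CpG site's methylation levels across samples.

    High entropy: methylation is spread evenly over the samples (a uniform site).
    Low entropy: methylation is concentrated in a few samples (a candidate DMS).
    """
    total = sum(values)
    if total == 0:
        # Assumption: a site with no methylation anywhere is treated as uniform.
        return math.log2(len(values))
    probs = [v / total for v in values]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical beta values for two CpG sites across four samples.
uniform_site = [0.5, 0.5, 0.5, 0.5]
specific_site = [0.9, 0.05, 0.05, 0.0]

e_uniform = methylation_entropy(uniform_site)
e_specific = methylation_entropy(specific_site)
print(e_uniform, e_specific)
```

A threshold on such an entropy score would then separate uniform sites from sample-specific ones; in the study that threshold is derived from the SD parameter rather than set by hand.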

1.3 Clustering analysis
The JHU-USC HumanMethylation450K data were downloaded from TCGA (https://tcga-data.nci.nih.gov).
To show that the occurrence of AML is related not only to genomic mutations but also to epigenetic changes (DNA methylation), we performed clustering analysis of the differential methylation profile with MeV v4.9 (http://www.tm4.org/mev.html). MeV is software for analyzing normalized and filtered expression profile data; it implements a variety of algorithms for clustering, visualization, classification, statistical analysis and other functions. We used hierarchical clustering with average linkage and a Pearson correlation distance matrix to obtain the clustering heat map. Finally, we performed a t-test on the differential methylation profile in MeV v4.9 to test the significance of the difference between cancer samples and normal samples.
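
As a rough illustration of the clustering step, the sketch below implements average-linkage agglomerative clustering with distance 1 minus the Pearson correlation in plain Python. It is not MeV's code; the function names, the stopping rule (a fixed number of clusters) and the toy profiles are assumptions made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles, n_clusters):
    """Merge samples until n_clusters remain, using 1 - Pearson r as distance."""
    names = list(profiles)
    dist = {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for a in names for b in names if a != b
    }
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs.
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(dist[p] for p in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical methylation profiles (four CpG sites per sample).
profiles = {
    "control_1": [0.10, 0.20, 0.15, 0.80],
    "control_2": [0.12, 0.22, 0.14, 0.78],
    "aml_1": [0.85, 0.70, 0.90, 0.20],
    "aml_2": [0.80, 0.75, 0.88, 0.25],
}
groups = average_linkage(profiles, 2)
print(groups)
```

On real 450K data a library implementation would be preferable, since this naive loop scales poorly with the number of samples.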

1.4 Relationship between DNA methylation and gene expression of different genome regions
To analyze the relationship between DNA methylation and gene expression in different genomic regions, we mapped the differentially methylated genes to two regions: (1) the gene body and (2) the 2 kb upstream of the transcription start site. For the gene body, we first calculated the mean DNA methylation and the mean gene expression across samples and performed a bivariate correlation analysis using SPSS version 19.0 (http://www-01.ibm.com/software/analytics/spss). We then separated the gene-body DMS into the disease group and the normal group and drew correlation plots using R. Finally, for the promoter region, we counted the hypermethylated genes with low expression and the hypermethylated genes with high expression using the SAMR package. Genes with fold change > 2 or fold change < 0.5 were taken as differentially expressed. The purpose was to analyze the relationship between DNA methylation and gene expression in the promoter region.
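
The fold-change rule for the promoter analysis can be sketched as follows. The sketch assumes expression values on a linear scale; RMA output is on a log2 scale, so such values would need to be back-transformed first (or the rule restated as a difference of log2 values). The record layout, field names and toy numbers are hypothetical.

```python
def classify_expression(disease_mean, control_mean, up=2.0, down=0.5):
    """Label a gene by the fold change of mean expression (disease / control)."""
    fold_change = disease_mean / control_mean
    if fold_change > up:
        return "high"
    if fold_change < down:
        return "low"
    return "unchanged"

def count_promoter_classes(genes):
    """Count hypermethylated promoter genes by expression class."""
    counts = {"high": 0, "low": 0, "unchanged": 0}
    for gene in genes:
        if gene["hypermethylated"]:
            label = classify_expression(gene["disease_expr"], gene["control_expr"])
            counts[label] += 1
    return counts

# Hypothetical linear-scale mean expression values.
genes = [
    {"name": "gene_a", "hypermethylated": True, "disease_expr": 10.0, "control_expr": 40.0},
    {"name": "gene_b", "hypermethylated": True, "disease_expr": 90.0, "control_expr": 30.0},
    {"name": "gene_c", "hypermethylated": True, "disease_expr": 50.0, "control_expr": 40.0},
    {"name": "gene_d", "hypermethylated": False, "disease_expr": 5.0, "control_expr": 50.0},
    {"name": "gene_e", "hypermethylated": True, "disease_expr": 12.0, "control_expr": 60.0},
]
counts = count_promoter_classes(genes)
print(counts)
```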

1.5 Functional enrichment analysis
GO functional analysis identifies the main biological functions of the screened differentially methylated genes.
Pathway enrichment analysis identifies the metabolic pathways in which the differentially methylated genes take part. Both the functional and the pathway enrichment analyses of the screened differentially methylated genes were performed with the DAVID software.
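
To make the notion of enrichment concrete, the sketch below computes a one-sided hypergeometric p-value for the overlap between a gene list and an annotation term. DAVID itself reports a modified Fisher exact p-value (the EASE score), so this plain hypergeometric tail is a simpler stand-in, and the function name and toy numbers are assumptions.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Toy numbers: 20 background genes, 5 annotated to a pathway,
# 4 genes in the list of interest, 3 of them annotated.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(p_value)
```

A small p-value means the overlap is larger than expected by chance; with many terms tested, a multiple-testing correction would also be needed.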

2 Results
2.1 DNA methylation data processing results
JHU-USC HumanMethylation450K data for 42 samples with DNMT3A mutations (excluding IDH mutations), 19 samples with IDH mutations (excluding DNMT3A mutations) and 13 IDH_3AD samples were used to draw a DNA methylation cluster heat map of all disease samples (Figure 1). Rows represent samples and columns represent CpG sites. As Figure 1 shows, DNA methylation levels differ among samples, suggesting that AML may be related to DNA methylation. DNA methylation characteristics associated with AML can therefore be mined to classify patients.
 

 
Figure 1 DNA methylation cluster heatmap of all disease samples (x-axis represents the type of mutation, y-axis represents probe)


We used JHU-USC HumanMethylation450K data downloaded from TCGA (https://tcga-data.nci.nih.gov). Each sample has 485,577 CpG sites, and each site has a methylation value. First, to remove the influence of differences among samples, we screened DMS among samples with the same mutation using QDMR (SD = 0.15). The results are shown in Table 1:
 

 
Table 1 Number of stable CpG sites within each sample group


As Table 1 shows, the 40 normal blood samples have little inter-individual variation compared with the disease samples. We then intersected the stable CpG sites of the four sample types to obtain 303,019 CpG sites. We screened DMS between each of the three mutation groups and the normal samples using QDMR (SD = 0.07). The results are shown in Table 2:
 

 
Table 2 The number of DMS 


We then took the union of the DMS in Table 2 and obtained 105,229 CpG sites. We screened these CpG sites for DMS using the SAMR package, finally obtaining 1,991 DMS that correspond to 1,452 genes. We used these 1,452 genes for clustering analysis, applying hierarchical clustering with average linkage and a Pearson correlation distance matrix to obtain the clustering heat map (Figure 2). Rows represent genes and columns represent samples; from top to bottom, the samples are normal samples, IDH_3AD mutation samples, IDH mutation samples and DNMT3A mutation samples.
 

 
Figure 2 Clustering heat map of disease samples and normal samples (x-axis represents probe, y-axis represents the type of mutation)


As Figure 2 shows, DNA methylation levels separate the different sample types well. Compared with normal samples, mutation samples show clear hypermethylation. Furthermore, most DNA methylation values of the disease samples exceed 0.5.

Finally, a t-test on the differential methylation profile in MeV v4.9 indicated that 98% of the genes are significant, showing that the methylation levels of disease samples and normal samples differ significantly. Figure 3 shows the t-test results. Rows represent samples and columns represent t-test values.
 

 
Figure 3 T test chart (x-axis represents the type of mutation, y-axis represents T test value)


2.2 Relationship between DNA methylation and expression of different genomic regions
Of the 1,452 screened genes, 1,235 have expression values in both disease and normal samples. These 1,235 genes were classified into two groups: (1) the gene body and (2) the 2 kb upstream of the transcription start site. In total, 1,012 genes fall in the gene body and 201 genes fall in the 2 kb upstream of the transcription start site. To analyze the relationship between DNA methylation and expression in the gene body, we first calculated the mean DNA methylation value and the mean expression value across samples, pooled the disease and normal samples, and performed a bivariate correlation analysis (DNA methylation versus expression) using SPSS 19.0. The results are shown in Table 3:
 

 
Table 3 Correlation analysis table 


As Table 3 shows, the p values of the Pearson, Kendall and Spearman correlation coefficients are all less than 0.01. This suggests that DNA methylation and expression in the gene body have a weak positive correlation that is significant at the 0.01 (one-tailed) level.
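
For readers who want to reproduce this kind of correlation analysis outside SPSS, the sketch below computes Pearson and Spearman coefficients in plain Python (Kendall's tau is omitted for brevity). It is an illustration under assumed toy data, not the authors' procedure, and it does not compute p values.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical per-gene mean methylation and mean expression values.
methylation = [0.1, 0.3, 0.5, 0.7, 0.9]
expression = [2.0, 2.5, 2.4, 3.5, 6.0]

r_pearson = pearson(methylation, expression)
r_spearman = spearman(methylation, expression)
print(r_pearson, r_spearman)
```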

The 1,012 gene-body genes were separated into the disease group and the normal group, and correlation plots were drawn with R (a free software environment for statistical computing and graphics), as shown in Figure 4:
 

 
Figure 4 Correlation between DNA methylation and gene expression in the gene body (only two panels are relevant: the panel in the first row, second column shows the relationship between DNA methylation and gene expression in cancer; the panel in the first row, fourth column shows the relationship in normal samples)


As Figure 4 shows, in the gene body, DNA methylation and gene expression have a weak positive correlation in disease samples, whereas in normal samples hypermethylation tends to be associated with low expression.

For the promoter, we took the genes that fall in the promoter region and screened differentially methylated genes and differentially expressed genes using an R package. We then counted the hypermethylated genes with low expression and the hypermethylated genes with high expression and plotted the relationship between DNA methylation and expression in the promoter, as shown in Figure 5:
 

 
Figure 5 The correlation graph between DNA methylation and expression in the promoter 


As Figure 5 shows, in the promoter, hypermethylated genes with low expression are markedly more numerous than hypermethylated genes with high expression. It can be inferred that, in the promoter region, hypermethylation tends to be associated with low expression.

2.3 Enrichment analysis results
We used the DAVID software to perform functional enrichment analysis of the 1,452 screened genes. The enriched functional categories and GO biological processes are shown in Table 4:
 

 
Table 4 Function enrichment analysis 


We also tested the differentially methylated genes for KEGG pathway enrichment; the results are shown in Table 5:

Table 5 Pathway enrichment analysis

As Tables 4 and 5 show, the differentially methylated genes are enriched for the phosphoprotein and calcium categories. Their main functions involve chromosomal rearrangement, cell adhesion and leukocyte activation, as well as synaptic transmission, nucleotide metabolism and transcriptional regulation of gene expression. Moreover, these differentially methylated genes are enriched mostly in cancer-related pathways, such as the focal adhesion and glioma pathways. The screened differentially methylated genes therefore appear to play a role in cancer development, so these methylation features offer a way to classify samples.

We identified aberrant DNA methylation sites, performed functional analysis of these sites, and explored the relationship between DNA methylation and AML. The purpose is to classify samples and perform functional analysis on the basis of DNA methylation, which may help the diagnosis and therapy of AML.

3 Discussion
To analyze the relationship among genomic mutations, DNA methylation and AML, the genomic mutations that cause AML should be considered together with epigenetic changes. The advantage of our approach is that it combines several DMS screening methods (QDMR and SAM), which can identify DMS more accurately. The method can be applied to many types of cancer and used to classify many cancer patients. Although DNA methylation in single cancer types has been studied extensively, analyses across a variety of cancers are lacking. Therefore, we may continue to identify abnormal methylation sites in cancers of various organs, perform functional analysis, predict oncogenes and search for drug targets. This work may help the diagnosis and therapy of AML.

Acknowledgments
The authors thank Dr Yan Zhang and Dr Jianzhong Su for their help, and thank the Epigenetic Department of Harbin Medical University and the College of Bioinformatics Science and Technology, Harbin Medical University.

References
Byrd, J.C., et al., Pretreatment cytogenetic abnormalities are predictive of induction success, cumulative incidence of relapse, and overall survival in adult patients with de novo acute myeloid leukemia: results from Cancer and Leukemia Group B (CALGB 8461). Blood, 2002. 100 (13): p. 4325-4336
http://dx.doi.org/10.1182/blood-2002-03-0772

Grimwade, D., et al., The importance of diagnostic cytogenetics on outcome in AML: analysis of 1,612 patients entered into the MRC AML 10 trial. The Medical Research Council Adult and Children's Leukaemia Working Parties. Blood, 1998. 92 (7): p. 2322-3

Gunjan, A. and S. RK, Epigenetic therapy: targeting histones and their modifications in human disease. Future Medicinal Chemistry, 2010. 2 (4): p. 543-548
http://dx.doi.org/10.4155/fmc.10.18

Baylin, S.B., et al., Aberrant patterns of DNA methylation, chromatin formation and gene expression in cancer. Human Molecular Genetics, 2001. 10 (7): p. 687-69
http://dx.doi.org/10.1093/hmg/10.7.687

Baylin, S.B. and P.A. Jones, A decade of exploring the cancer epigenome - biological and translational implications. Nature Reviews Cancer, 2011. 11 (10): p. 726-734
http://dx.doi.org/10.1038/nrc3130

Jost, E., et al., Epimutations mimic genomic mutations of DNMT3A in acute myeloid leukemia. Leukemia, 2014. 28(6): p. 1227-1234
http://dx.doi.org/10.1038/leu.2013.362

Ogawara, Y., et al., NPMC cooperates with mutant IDH2 to induce acute myeloid leukemia. Experimental Hematology, 2013. 41 (8): p. S55

Chen, Z.J. and C.S. Pikaard, Epigenetic silencing of RNA polymerase I transcription: a role for DNA methylation and histone modification in nucleolar dominance. Genes & Development, 1997. 11(16): p. 2124-2136
http://dx.doi.org/10.1101/gad.11.16.2124

Arzenani, M.K. and M.K. Arzenani, Genomic DNA methylation in health and disease. Institutionen för molekylär medicin och kirurgi, 2009

Vaissière, T., et al., Epigenetic interplay between histone modifications and DNA methylation in gene silencing. Mutation Research, 2008. 659 (1-2): p. 40-48
http://dx.doi.org/10.1016/j.mrrev.2008.02.004

Liu, X., et al., Regulation of microRNAs by epigenetics and their interplay involved in cancer. Journal of Experimental & Clinical Cancer Research, 2013. 32(11): p. 2941-2950
http://dx.doi.org/10.1186/1756-9966-32-96

Issa, J.P., DNA methylation as a therapeutic target in cancer. Clinical Cancer Research An Official Journal of the American Association for Cancer Research, 2007. 13(6): p. 163
http://dx.doi.org/10.1158/1078-0432.CCR-06-2076

Esteller, M., Cancer Epigenetics for the 21st Century. Genes Cancer, 2011. 2 (6): p. 604-606
http://dx.doi.org/10.1177/1947601911423096

Gerda, E., et al., Epigenetics in human disease and prospects for epigenetic therapy. Nature, 2004. 429(11): p. 457-46

Toyota, M., et al., CpG island methylator phenotype in colorectal cancer. Proc Natl Acad Sci U S A, 1999. 96(15): p. 8681-8686
http://dx.doi.org/10.1073/pnas.96.15.8681

M, T., et al., Aberrant methylation in gastric cancer associated with the CpG island methylator phenotype. Cancer Research, 1999. 59 (21): p. 5438-5442

Witte, T., C. Plass, and C. Gerhauser, Pan-cancer patterns of DNA methylation. Genome Medicine, 2014. 6 (9): p. 798-801
http://dx.doi.org/10.1186/s13073-014-0066-6

Felicitas, T., et al., Incidence and prognostic influence of DNMT3A mutations in acute myeloid leukemia. Journal of Clinical Oncology Official Journal of the American Society of Clinical Oncology, 2011. 29 (21): p. 2889-96
http://dx.doi.org/10.1200/JCO.2011.35.4894

Stoll, D. and R. Akbani, Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. New England Journal of Medicine, 2013. 368(22): p. 2059-2074
http://dx.doi.org/10.1056/NEJMoa1301689

Walter, M.J., et al., Recurrent DNMT3A mutations in patients with myelodysplastic syndromes. Leukemia, 2011. 25 (7): p. 1153-1158
http://dx.doi.org/10.1038/leu.2011.44

Xiao-Jing, Y., et al., Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia. Nature Genetics, 2011. 43 (4): p. 309-315
http://dx.doi.org/10.1038/ng.788

Ribeiro, A.F., et al., Mutant DNMT3A: a marker of poor prognosis in acute myeloid leukemia. Blood, 2012. 119 (24): p. 5824-5831
http://dx.doi.org/10.1182/blood-2011-07-367961

Wen, H., et al., Metabolomic comparison between cells over‐expressing isocitrate dehydrogenase 1 and 2 mutants and the effects of an inhibitor on the metabolism. Journal of Neurochemistry, 2015. 132 (2): p. 183-193
http://dx.doi.org/10.1111/jnc.12950

Figueroa, M.E., et al., Leukemic IDH1 and IDH2 mutations result in a hypermethylation phenotype, disrupt TET2 function, and impair hematopoietic differentiation. Cancer Cell, 2010. 18 (6): p. 553–567
http://dx.doi.org/10.1016/j.ccr.2010.11.015

Yan, Z., et al., QDMR: a quantitative method for identification of differentially methylated regions by entropy. Nucleic Acids Research, 2011. 39 (9): p. e58
http://dx.doi.org/10.1093/nar/gkr053 
 
